About Plotly

This graphing library is free and open source. It allows the user to make many different types of interactive, publication-quality plots. Some of the many things we can create with Plotly are line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, and 3D Charts. One benefit of Plotly is that it allows the user to hover their mouse over the plots and see data points, zoom in or out of specific areas, or capture stills.

plotly is a product of the tech company “Plotly” started by Alex Johnson, Jack Parmer, Chris Parmer, and Matthew Sundquist. The package is written in JavaScript, but can be used in other languages syntaxes such as R and Python.

Why use Plotly?

Data visualization acts as an accentuating yet critical component of data science. Although quantitative analysis and technical knowledge drive the extraction of insight from data, perhaps what acts as a just as equal, if not more, counterpart is the presentation of findings. Data visualization perfectly accentuates data analytics, as it allows for data scientists to visually present and communicate important findings within the data in a digestible manner to key stakeholders. Traditional data visualization packages, such as ggplot2 or plotting functions within native R, provide means and methods of accomplishing this. However, plotly's value proposition is the ability to expound upon tradition two dimensional plotting techniques by offering interactivity and dimensionality that allows both data scientists and key stakeholders alike to extract more complex and meaningful insights, beyond what is capable in 2d graphics.

How to Install Plotly

There are two ways of installing Plotly. One of the ways is by downloading from CRAN with the command install.packages("plotly"). However, this may not be the most updated version so the other method of installing the library is suggested. This method is by downloading it from Github with the command devtools::install_github("ropensci/plotly") . This will ensure the latest version is downloaded.

How to Load the Plotly Package

In order to use the Plotly package, the user must first load it in using the command library(plotly).

Some Important Info About Plotly

For an in-depth overview of the plotly package, click here.

Version History

Current.Version
4.9.2.1

Usage

plotly is a data visualization package that can be used to make the following types of graphics:

  • line plots
  • scatter plots
  • area charts
  • bar charts
  • error bars
  • box plots
  • histograms
  • heatmaps
  • subplots
  • multiple-axes
  • 3D Charts

Dependencies

  • R (≥ 3.2.0) - R version 3.2.0 or later
  • ggplot2 (≥ 3.0.0) - ggplot2 version 3.0.0 or later

Imports (Exogenous Package Use)

  • tools
  • scales
  • httr
  • jsonlite (≥ 1.6)
  • magrittr
  • digest
  • viridisLite
  • base64enc
  • htmltools (≥ 0.3.6)
  • htmlwidgets (≥ 1.3)
  • tidyr, hexbin
  • RColorBrewer
  • dplyr
  • tibble
  • lazyeval (≥ 0.2.0)
  • rlang
  • crosstalk
  • purrr
  • data.table
  • promises

About our Data Set

The World Happiness Report is a survey of the state of global happiness. This survey ranks 155 countries depending on their level of happiness, and analyzes the factors that leads to a happier country/citizens. The happiness scores and rankings are determined form the Gallup World Poll. There are six factors that the happiness score estimates: economic production, social support, life expectancy, freedom, absence of corruption, and generosity. These are believed to all contribute, in some manner, to the increase or decrease of global happiness. Our aim, through using this dataset, is to gain a better understanding of the factors that lead to a happier country, while also showing the different capabilities that the Plotly package allows.

For an in-depth overview of the dataset used below, click here.

Examples of plots that can be made with plotly

Scatterplots

Plotly allows the user to make many different types of scatterplots. These can range from the basic 2D scatterplot analyzing two variables, to more complex 3D structures that look at the relationship between three variables. Below are some examples of the many types of scatterplots that can be produced with Plotly.

Basic 2D Scatter Plots

Plotly makes it extremely easy to create simple 2D scatterplots to see relationships between two variables.

fig <- plot_ly(data = nineteen, x = ~Healthy.life.expectancy, y = ~Overall.rank)
fig

This scatterplot shows that as Healthy Life Expectancy increases, the overall rank decreases.This means that countries with citizens who have higher Healthy Life Expectancies are also more likely to be the countries with a better ranking (lower, in this case, is better).

Styled 2D Scatter Plots

fig <- plot_ly(data = nineteen, x = ~Healthy.life.expectancy, y = ~Overall.rank,
               marker = list(size = 10,
                             color = '#BFDFFC',
                             line = list(color = '#067BEA',
                                         width = 2)))
fig <- fig %>% layout(title = 'Styled Scatter',
         yaxis = list(zeroline = FALSE),
         xaxis = list(zeroline = FALSE))

fig

Here, we have the same relationship being analyzed as we did in the Basic 2D Scatterplot, but we have styled the scatter plot through manipulations to the data markers.

2D Scatterplot with Quantitative Colorscales

Plotly also allows the user to have a colorscale dependent on a specific grouping. For Quantitative or Continuous groupings, the legend will be a scale that fades from one color to another to show varying levels of the grouping.

fig <- plot_ly(nineteen, x = ~Freedom.to.make.life.choices, y = ~Social.support, text = ~Country.or.region, type = 'scatter', mode = 'markers', color = ~Score, colors = 'Blues',
        marker = list(size = ~Score, opacity = 0.7,sizemode = 'diameter'))
fig <- fig %>% layout(title = 'Social Support and Freedom to Make Life Choices per Country',
         xaxis = list(showgrid = FALSE),
         yaxis = list(showgrid = FALSE))
fig

Here, we have an example of a qualitative colorscale where the scatterplot is broken down by the overall scores. The darker colors signify a higher overall score, and the lighter colors signify a lower overall score. We can see that the countries that have the highest freedom to make life choices and the highest social support are also the highest in their scores. Some of these countries are Iceland, Denmark, Finland, Australia, and Canada.

2D Scatterplot with Qualitative Colorscales

As discussed on the previous example, Plotly allows the user to have a colorscale dependent on a specific grouping. For Qualitative groupings, the legend will separate the colors based on the specific groupings.

fig <- plot_ly(data = nineteen, x = ~Healthy.life.expectancy, y = ~Score, color = ~Country.or.region , colors="Blues")
fig
## No trace type specified:
##   Based on info supplied, a 'scatter' trace seems appropriate.
##   Read more about this trace type -> https://plot.ly/r/reference/#scatter
## No scatter mode specifed:
##   Setting the mode to markers
##   Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode

In the example above, we have grouped the points by Country/Region. We can see that each grouping (in this case country) was assigned a different shade of blue. As a result, we can analyze the Score vs Healthy Life Expectancy broken down by the Country.

3D Scatter Plots

With Plotly, the user can also create 3D scatter plots to analyze three different variables and their relationships.

fig <- plot_ly(nineteen, x = ~Healthy.life.expectancy, y = ~Perceptions.of.corruption,
               z = ~Social.support, color = ~Overall.rank, colors = c('#067BEA', '#464D52'))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(xaxis = list(title = 'Healthy Life Expectancy'),
                     yaxis = list(title = 'Perceptions of Corruption'),
                     zaxis = list(title = 'Social Support')))
fig

This scatter plot shows the relationship between Perceptions of Corruption, Generosity, and Overall Rank. We can see that the countries with the best overall rank (aka lowest rank) are those with high social support, low perceptions of corruption, and high healthy life expectancy.

Bar Graphs

Basic Bar Graph with Direct Labels

This shows a bar chart for the 2019 data set. It analyzes the Country/Region and the Score associated with that location.

#Subsetting the data to only show the top 10 countries with the highest scores
nineteen<- nineteen %>%
    group_by(Score) %>%
    top_n(10, Score) %>%
    head(10)

x<- as.factor(nineteen$Country.or.region)
y<- as.numeric(nineteen$Score)
data <- data.frame(x, y)

fig1 <- plot_ly(data, x = ~x, y = ~y, type = 'bar',
             text = y, textposition = 'auto',
             marker = list(color = 'rgb(158,202,225)',
                           line = list(color = 'rgb(8,48,107)', width = 1.5)))
fig1 <- fig1 %>% layout(title = "Top Ten Happiness Scores Based on Country (for the year 2019)",
         xaxis = list(title = "Country/Region"),
         yaxis = list(title = "Score"))
fig1

Grouped Bar Graph

This shows a grouped bar chart for the 2019 data set. This shows a bar graph looking at the top 10 countries with the highest scores. The graph shows their healthy life expectancy, social support, and GDP per capita counts.

#Subsetting the data to only show the top 10 countries with the highest scores
nineteen<- nineteen %>%
    group_by(Score) %>%
    top_n(10, Score) %>%
    head(10)

x<- as.factor(nineteen$Country.or.region)
y<- as.numeric(nineteen$Score)
data <- data.frame(nineteen, x, y)

fig <- plot_ly(nineteen, x = ~Country.or.region, y = ~Healthy.life.expectancy, type = 'bar', 
            name = 'Healthy Life Expectancy', text = y, textposition = 'auto',
             marker = list(color = '#065198',
                           line = list(color = '#0F242A', width = 1.5)))

fig <- fig %>% add_trace(y = ~Social.support, name = 'Social Support',
            name = 'Score', text = y, textposition = 'auto',
             marker = list(color = '#CCE4FC',
                           line = list(color = '#0F242A', width = 1.5)))
fig <- fig %>% add_trace(y = ~GDP.per.capita, name = 'GDP per Capita',
            name = 'Score', text = y, textposition = 'auto',
             marker = list(color = '#32A6C6',
                           line = list(color = '#0F242A', width = 1.5)))
fig <- fig %>% layout(title = "Analyzing the Top 10 Countries (for the year 2019)",
         xaxis = list(title = "Country/Region"),
         yaxis = list(title = "Count"))

fig

As we can see above, of the ten groups, Iceland has the most Social Support, Norway has the most GDP per Capita, and Switzerland has the most Healthy Life Expectancy values.

Putting Two (or more) Graphs Side-By-Side

In order to see relationships or analyze data more clearly, the user can use Plotly to place two (or more) plots side-by-side.

#Subsetting the data to only show the top 10 countries with the highest scores
nineteen<- nineteen %>%
    group_by(Score) %>%
    top_n(10, Score) %>%
    head(10)
    
x<- as.factor(nineteen$Country.or.region)
y<- as.numeric(nineteen$Score)
data <- data.frame(x, y)

fig1 <- plot_ly(data, x = ~x, y = ~y, type = 'bar',
             name = '2019',
             text = y, textposition = 'auto',
             marker = list(color = '#065198',
                           line = list(color = '#0F242A', width = 1.5)))
fig1 <- fig1 %>% layout(title = "Top Ten Happiness Scores Based on Country (for the year 2019)",
         xaxis = list(title = "Country/Region"),
         yaxis = list(title = "Score"))

#Subsetting the data to only show the top 10 countries with the highest scores
eighteen<- eighteen %>%
    group_by(Score) %>%
    top_n(10, Score) %>%
    head(10)
    
x2<- as.factor(eighteen$Country.or.region)
y2<- as.numeric(eighteen$Score)
data2 <- data.frame(x2, y2)

fig2 <- plot_ly(data, x = ~x2, y = ~y2, type = 'bar',
             name = '2018',
             text = y2, textposition = 'auto',
             marker = list(color = '#CCE4FC',
                           line = list(color = '#0F242A', width = 1.5)))
fig2 <- fig2 %>% layout(title = "Top Ten Happiness Scores Based on Country (for the year 2018)",
         xaxis = list(title = "Country/Region"),
         yaxis = list(title = "Score"))

subplot(fig2,fig1) %>% layout(title ="Top Ten Happiness Scores Based on Country")

Here, we are using both the 2018 and 2019 datasets. We have made bar graphs to analyze general trends of their overall score based on the country/region. We can see that both years have similar results; however, in 2018, Norway was the country with the highest Happiness Score, and in 2019, Finland was the country with the highest Happiness Score.

Line Plots

Basic Line Plot

Plotly allows the user to create simple line plots to view the relationship between two variables.

nineteen<- read.csv('2019.csv')
fig <- plot_ly(nineteen, x = ~Overall.rank, y = ~Healthy.life.expectancy, type = 'scatter', mode = 'lines')
fig

This line plot allows us to see the general pattern that with overall rank increasing (getting worse), there’s a general pattern with the healthy life expectancy decreasing.

Line Plot Mode

Plotly also allows the user to change the mode of the lines.

nineteen<- read.csv('2019.csv')

fig <- plot_ly(nineteen, x = ~Overall.rank, y = ~GDP.per.capita, name = 'GDP Per Capita', type = 'scatter', mode = 'lines') 
fig <- fig %>% add_trace(y = ~Healthy.life.expectancy, name = 'Healthy Life Expectancy', mode = 'lines+markers') 
fig <- fig %>% add_trace(y = ~Social.support, name = 'Social Support', mode = 'markers')

fig

For example, in this example, we can see the modes are different for the different y variables that we are analyzing. As the overall rank increases, we can see it’s relationship with GDP Per Capita (straight blue line), Healthy Life Expectancy (orange straight line with dots), and Social Support (green dots). By using mode= ' ', the user can specify what they want their line plots to look like.

Line Graph with Restyle Button

With Plotly, the user can add a “restyle” method that can then modify the data/data attributes of the graph.

head(nineteen)
##   Overall.rank Country.or.region Score GDP.per.capita Social.support
## 1            1           Finland 7.769          1.340          1.587
## 2            2           Denmark 7.600          1.383          1.573
## 3            3            Norway 7.554          1.488          1.582
## 4            4           Iceland 7.494          1.380          1.624
## 5            5       Netherlands 7.488          1.396          1.522
## 6            6       Switzerland 7.480          1.452          1.526
##   Healthy.life.expectancy Freedom.to.make.life.choices Generosity
## 1                   0.986                        0.596      0.153
## 2                   0.996                        0.592      0.252
## 3                   1.028                        0.603      0.271
## 4                   1.026                        0.591      0.354
## 5                   0.999                        0.557      0.322
## 6                   1.052                        0.572      0.263
##   Perceptions.of.corruption
## 1                     0.393
## 2                     0.410
## 3                     0.341
## 4                     0.118
## 5                     0.298
## 6                     0.343
fig <- plot_ly(nineteen, x = ~Score)
fig <- fig %>% add_lines(y = ~ Freedom.to.make.life.choices)

fig <- fig %>% layout(
  title = "Freedom to Make Life Choices vs Score",
  xaxis = list(domain = c(0.1, 1)),
  yaxis = list(title = "Freedom to Make Life Choices"),
  updatemenus = list(
    list(
      type = "buttons",
      y = 0.8,
      buttons = list(

        list(method = "restyle",
             args = list("line.color", "blue"),
             label = "Blue"),

        list(method = "restyle",
             args = list("line.color", "red"),
             label = "Red")))
  ))

fig

Above, we have an example of how the user can modify the color of the Line Plot depending on the button they choose. The ‘Blue’ button will show the graph in a blue color, whereas the ‘Red’ button will show the graph in a red color. Here, we can see that as the happiness score of countries generally go up, their freedom to make life choices also goes up; there is a direct relationship.

Choropleth Maps in R

Basic Choropleth Map

#In order to make a choropleth map using plotly, we need to convert all 
#the country names that we have to their associated country abbreviation

#To convert the country names to their codes, load the "countrycode" package
library(countrycode)

nineteen<- read.csv('2019.csv')
nineteen$Country.or.region <- as.factor(nineteen$Country.or.region)
nineteen$Country.or.region<- countrycode(nineteen$Country.or.region, 'country.name', 'iso3c')

fig <- plot_ly(nineteen, type='choropleth', locations=nineteen$Country.or.region, z=nineteen$Score, text=nineteen$GDP.per.capita, colorscale="YlGnBu",reversescale = T)
fig <- fig %>% layout(
    title = "2019 Happiness Report Shown by GDP per Capita"
)
fig <- fig %>% colorbar(title = "GDP per Capita")
fig

This is an example of a choropleth map, and the user can hover over the countries to see the country name as well as the GDP per capita value for that place.

Mapping with Aggregations

Plotly–in addition to a library called listviewer which is designed to help see, understand, and maybe even modify your R lists–allows the user to make choropleth maps with a dropdown option of the operation they would like to perform to the variable they are exploring.

#We must import 'listviewer' in order to make a map with aggregations
library(listviewer)
eighteen<- read.csv('2018.csv')
nineteen<- read.csv('2019.csv')
all_years<- rbind(nineteen,eighteen)

s <- schema()
agg <- s$transforms$aggregate$attributes$aggregations$items$aggregation$func$values

l = list()
for (i in 1:length(agg)) {
  ll = list(method = "restyle",
            args = list('transforms[0].aggregations[0].func', agg[i]),
            label = agg[i]) 
  l[[i]] = ll
}

fig <- all_years %>%
  plot_ly(
    type = 'choropleth',
    locationmode = 'country names',
    locations = ~Country.or.region,
    z = ~GDP.per.capita,
    autocolorscale = F,
    reversescale = T,
    colorscale = 'YlGnBu', 
    transforms = list(list(
      type = 'aggregate',
      groups = ~Country.or.region,
      aggregations = list(
        list(target = 'z', func = 'sum', enabled = T)
      )
    ))
  )
fig <- fig %>% layout(
    title = "<b>World Happiness (2018-2019)</b>",
    geo = list(
      showframe = F,
      showcoastlines = F
    ),
    updatemenus = list(
      list(
        x = 0.25,
        y = 1.04,
        xref = 'paper',
        yref = 'paper',
        yanchor = 'top',
        buttons = l
      )
    )
  )

fig

In this example above, we are looking at the relationship of GDP per capita in the years 2018-2019. You can see, that depending on the different operations (max, min, std dev, etc), the values on the choropleth map change accordingly. The user can also hover over the map to see the country name and specific value.

Not a fan of Plotly? Here are some alternatives!

ggplot2


The ggplot2 package was created by Hadley Wickham and offers yet another powerful way of creating complex but elegant visualizations and graphics. Plotly and ggplot2 have a lot of differences in their syntax, customization, and documentation.

To install the ggplot2 package and ensure you properly load it in, use the following commands:
1) install.packages("ggplot2")
2) library(ggplot2)

leaflet


The leaflet package was created by

To install the leaflet package and ensure you properly load it in, use the following commands:
1) install.packages("leaflet")
2) library(leaflet)